
Merge master into feature/host-network-device-ordering #6486


Merged
merged 29 commits into feature/host-network-device-ordering from master
May 27, 2025

Conversation

changlei-li
Contributor

No description provided.

snwoods and others added 29 commits May 20, 2025 11:38
Migration spawns 2 operations which depend on each other so we need to
ensure there is always space for both of them to prevent a deadlock.
Adding VM_receive_memory to a new queue ensures that there will always
be a worker for the receive operation so the paired send will never be
blocked.

Signed-off-by: Steven Woods <steven.woods@cloud.com>
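A minimal sketch of the dedicated-queue idea above (illustrative only, not xenopsd's actual scheduler): receive operations are routed to their own named queue, so a receive can never be starved by sends occupying every worker in the shared pool. The operation type, payloads, and queue names below are assumptions.

```ocaml
(* Illustrative routing only; the real xenopsd queueing code differs. *)
type operation =
  | VM_receive_memory of string (* VM id, hypothetical payload *)
  | VM_migrate of string
  | Other of string

(* Receives get a dedicated queue with its own worker(s), so the paired
   send on the default queue cannot deadlock waiting for a receive that
   has no free worker. *)
let queue_of_operation = function
  | VM_receive_memory _ -> "VM_receive_memory"
  | VM_migrate _ | Other _ -> "default"

let () =
  [VM_migrate "vm1"; VM_receive_memory "vm1"; Other "cleanup"]
  |> List.iter (fun op -> print_endline (queue_of_operation op))
```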
We've seen that using the policy can be up to 10% faster than using "any" in
some workflows, while not observing workflows that were negatively affected.
The per-VM policy can always be changed if need be.

Note that currently best-effort sometimes falls back to the same behaviour,
especially when restarting or starting more than one VM at a time. This needs
Xen patches to be fixed:
https://lore.kernel.org/xen-devel/20250314172502.53498-1-alejandro.vallejo@cloud.com/T/#ma1246e352ea3cce71c7ddc26d1329a368548b3b2

The deprecated numa-placement configuration option for xenopsd now does
nothing. It was exclusively used to enable Best_effort; since that is now the
default, there's no point in setting the option. Its value depends on whether
the default option is best_effort or not, as per the spec.

Signed-off-by: Pau Ruiz Safont <pau.ruizsafont@cloud.com>
This change introduces a new `repository_domain_name_blocklist`
that lists repo URL patterns to be blocked.
On XAPI startup, any existing pool repository whose URLs match an
entry in this blocklist will be automatically removed. This ensures
that, for example, when upgrading from XS8 to XS9, any XS8 repos are
purged.

Additionally, repository creation now checks the same blocklist and
rejects any attempt to add a blocked repo.

- On startup: read blocklist, delete matching blocked repos
- On repository creation: validate against blocklist and abort if matched

Signed-off-by: Stephen Cheng <stephen.cheng@cloud.com>
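A minimal sketch of the two checks described above, under assumptions: the blocklist is modelled as a list of domain substrings, matching is a naive regexp search over the URL (the real pattern matching may differ), and the blocklist entry and helper names are hypothetical. Requires OCaml's `str` library.

```ocaml
(* Hypothetical blocklist entry; real entries come from xapi configuration. *)
let repository_domain_name_blocklist = ["repo.example-xs8.test"]

(* Naive check: does the URL contain any blocked domain as a substring? *)
let is_blocked url =
  List.exists
    (fun domain ->
      try
        ignore (Str.search_forward (Str.regexp_string domain) url 0) ;
        true
      with Not_found -> false)
    repository_domain_name_blocklist

(* On startup: keep only repositories whose URLs are not blocked. *)
let purge_blocked existing_repo_urls =
  List.filter (fun url -> not (is_blocked url)) existing_repo_urls

(* On repository creation: abort if the new URL matches the blocklist. *)
let validate_new_repo url =
  if is_blocked url then Error "repository URL is blocked" else Ok url
```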
…g. (#6475)

This change introduces a new `repository_domain_name_blocklist` that
lists repo URL patterns to be blocked.
On XAPI startup, any existing pool repository whose URLs match an
entry in this blocklist will be automatically removed. This ensures
that, for example, when upgrading from XS8 to XS9, any XS8 repos are
purged.

Additionally, repository creation now checks the same blocklist and
rejects any attempt to add a blocked repo.

- On startup: read blocklist, delete matching blocked repos
- On repository creation: validate against blocklist and abort if
matched

Tests:
- Create repo with the blocklist configured

![image](https://github.com/user-attachments/assets/8c77b76e-27ef-4184-b5aa-c68b6ee7b9c4)
- Create repo without the blocklist configured

![image](https://github.com/user-attachments/assets/89cebfb8-431d-4690-a667-9ecad6730ba8)
- With the blocklist, restart xapi and the repo was removed
`[root@eu1-dt013 yum.repos.d]# xe repository-list`
We've seen that using the policy can be up to 10% faster than using "any" in
some workflows, while not observing workflows that were negatively affected.
The per-VM policy can always be changed if need be.

Note that currently best-effort sometimes falls back to the same behaviour,
especially when restarting or starting more than one VM at a time. This needs
Xen patches to be fixed:

https://lore.kernel.org/xen-devel/20250314172502.53498-1-alejandro.vallejo@cloud.com/

Also fix the legacy numa-placement configuration option for xenopsd. It was
always deciding the setting, even when not used; now it only takes effect when
it's present, otherwise it leaves the default option untouched.
Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
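A minimal sketch of the numa-placement option handling described above (not xenopsd's actual code): the deprecated key only overrides the NUMA policy when it is explicitly present in the configuration; when absent, the Best_effort default is left untouched. The type and function names here are assumptions.

```ocaml
type numa_policy = Any | Best_effort

let default_policy = Best_effort

(* [numa_placement] is [Some b] only when the deprecated "numa-placement"
   key was present in the config file; [None] means it was not set. *)
let effective_policy (numa_placement : bool option) =
  match numa_placement with
  | None -> default_policy
  | Some true -> Best_effort
  | Some false -> Any

let () =
  assert (effective_policy None = Best_effort) ;
  assert (effective_policy (Some false) = Any)
```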
These two functions are the new SMAPIv3 functions that will enable
mirroring and querying of the mirror status, so implement them in
xapi-storage-script. The SMAPIv1 counterparts remain unimplemented.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
The similar VDI functionality is currently unused for SMAPIv3
migration, so just add a dummy implementation.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
This is the main commit that implements the MIRROR interface in
storage_smapiv3_migrate. The exact details of how SMAPIv3 mirroring is done
are left to the SXM documentation, but the core of it is to provide all the
necessary infrastructure to be able to call the `Data.mirror` SMAPIv3 call
that mirrors one VDI to another.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
dummy_vdi and parent_vdi are not created by
storage_smapiv3_migrate.receive_start2, so do not attempt to destroy
them in storage_smapiv3_migrate.receive_cancel2.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
This is to mimic the behaviour on SMAPIv1. The update_snapshot_info
function that runs at the end of migration will check for content_id,
and this is needed to make it happy.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
Whilst it is not the default behaviour on XS 8 to attach a VDI through
NBD, SXM inbound into an SMAPIv1 SR needs to have NBD enabled for
mirroring purposes. As tapdisk will return usable NBD parameters to
xapi, they can be included in the return value of attach. Most current
users of this return value will keep using the blktap2 kernel device, and
this NBD information is only used during SXM.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
As this NBD proxy is used for importing data, call it `import_nbd_proxy`
to distinguish it from the `export_nbd_proxy` that will be introduced
later on.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
This is a bit of a layering violation, as storage_mux should not care
about which version of SMAPI the SR uses, nor should it be responsible for
calling hook functions. But there is no way for xapi-storage-script
to invoke code in xapi (which would also be a layering violation if it
were possible), and smapiv1_wrapper has special state-tracking logic for
determining whether the hook should be called, so leave the hook here for
now.

Note the pre_deactivate_hook is not called as currently that remains a
noop for SMAPIv3. And as we do not support VM shutdown during outbound
SXM for SMAPIv3 anyway, leave a hack in the storage_mux for now until we
have a plan on how to support that.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
The attach and activate of the VDI being live migrated is there so that
the SXM can keep working even if the VM on which the VDI is activated
shuts down. This is possible on SMAPIv1 as tapdisk does not distinguish
between different domain parameters, but that is not the case for
SMAPIv3.

For now, just avoid activating the VDI on dom0, since it is already
activated on the live_vm. This does mean that SXM will stop working if
the VM is shut down during storage migration. We will leave that case for
the future.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
There is a mirror_checker/tapdisk_watchdog for SMAPIv1 that periodically
checks the status of the mirror and sends an update if it detects a
failure.

Implement something similar for SMAPIv3 mirror, although this
check happens for a shorter period of time compared to the SMAPIv1
tapdisk_watchdog because the `Data.stat` call will stop working once the
VM is paused, and currently we have no easy way to terminate this mirror
checker just before the VM is paused (in xenopsd). So only do this check
whilst the mirror syncing is in progress, i.e. when we are copying over
the existing disk content.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
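A minimal sketch of the checking loop described above, under assumptions: `stat_mirror` stands in for the SMAPIv3 `Data.stat` call, the record fields and the 5-second poll interval are hypothetical, and the loop stops as soon as the initial copy is no longer in progress. Requires the `threads` library.

```ocaml
type mirror_stat = {syncing: bool; failed: bool}

(* Poll the mirror status while the initial copy is syncing; report a
   failure if one is seen, and stop once syncing finishes, because the
   stat call is no longer reliable after the VM is paused. *)
let rec check_mirror ~(stat_mirror : unit -> mirror_stat) ~on_failure =
  let s = stat_mirror () in
  if s.failed then on_failure ()
  else if s.syncing then (
    Thread.delay 5.0 ; (* poll interval: an assumption *)
    check_mirror ~stat_mirror ~on_failure
  )
```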
Previously the tapdisk watchdog in SMAPIv1 mirroring was cancelled in
the `post_deactivate_hook`, but at that point the VDI has already been
deactivated, and hence the mirror would have been terminated.
Additionally, the last time the stats are retrieved is in
`pre_deactivate_hook`, so do this cancelling after the last stats
retrieval.

Note that SMAPIv3 mirroring does not have a watchdog, due to the limitation
that the mirror job auto-cancels after a guest pause; instead, mirror
checking is only done whilst the mirror syncing (i.e. copying existing
disk content) is in progress.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
This is a continuation of #6439 in the effort of implementing outbound
SXM for SMAPIv3. We have reached the climactic point of this and can
actually now implement the logic to do outbound SXM for SMAPIv3 SRs!

This is a rather large PR and I expect it to take some time to be
reviewed and merged, so I am opening this early to gather some feedback.
Since #6439 is not yet merged and this PR depends on that one, I am
marking this one as a draft. While reviewing, please ignore the first
three commits in this PR and look at #6439 first instead. I will update
this one again when #6439 is merged.

There are also a couple of docs PRs at the end documenting the design and
approach taken in doing SMAPIv3 migration.

In terms of the testing plan, the important thing for now is to make sure
that this does not regress SMAPIv1 migration. For that I will be
using the SXM functional test suite. I will also be adding more tests to
actually test the SMAPIv3 SXM feature.
In commit 2eff6ab,
the HTTP handler was renamed to add an "import" in the URL, but we need
to keep the previous one for backwards compatibility. This is so that
previous versions of sparse_dd in XS 8 can migrate to the latest one.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
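A minimal sketch of the backwards-compatibility idea, assuming a simple registration function that maps URI paths to handlers; the paths, `add_handler`, and `handler` below are illustrative placeholders, not xapi's real ones.

```ocaml
(* Register the same handler under both the new path (with "import" in the
   URL) and the legacy path, so that XS 8 sparse_dd clients that still use
   the old URL keep working. Paths are illustrative placeholders. *)
let register_nbd_proxy_handlers
    ~(add_handler : string -> (unit -> unit) -> unit) ~(handler : unit -> unit)
    =
  add_handler "/example/import_nbd_proxy" handler ;
  add_handler "/example/nbd_proxy" handler
```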
The RRD loop is executed every 5 seconds. It waits a fixed 5 seconds between
each iteration, but the loop body itself also consumes time (how much depends
on the number of CPUs; with many CPUs it may be hundreds of milliseconds).
This implementation makes the RRD loop drift by an offset after several
iterations. An RRD data point is then lost and a gap can be observed on the
XenCenter performance graph.

The solution is to use a fixed deadline as each iteration's start time and to
use a computed delay (timeslice minus the time consumed by the loop) instead
of a fixed delay.

Signed-off-by: Bengang Yuan <bengang.yuan@cloud.com>
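A minimal sketch of the fixed-deadline scheduling described above (not the actual RRD loop code): each iteration's target start time is derived from the original start plus a whole number of timeslices, and the sleep is the time remaining until that deadline rather than a fixed 5 seconds, so work done inside the loop does not accumulate as drift. Requires the `unix` and `threads` libraries; names are illustrative.

```ocaml
let timeslice = 5.0 (* seconds *)

let run_loop ~(do_work : unit -> unit) =
  let start = Unix.gettimeofday () in
  let rec go n =
    do_work () ;
    (* deadline for the next iteration, anchored to the original start *)
    let next_deadline = start +. (float_of_int (n + 1) *. timeslice) in
    (* computed delay: timeslice minus the time the loop body consumed *)
    let delay = next_deadline -. Unix.gettimeofday () in
    if delay > 0.0 then Thread.delay delay ;
    go (n + 1)
  in
  go 0
```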
The RRD loop is executed every 5 seconds. It waits a fixed 5 seconds between
each iteration, but the loop body itself also consumes time (how much depends
on the number of CPUs; with many CPUs it may be hundreds of milliseconds).
This implementation makes the RRD loop drift by an offset after several
iterations. An RRD data point is then lost and a gap can be observed on the
XenCenter performance graph.

The solution is to use a computed delay (timeslice minus the time consumed by
the loop) instead of a fixed delay.
When a customer opens the "Migrate VM Wizard" in XenCenter, XenCenter calls
`VM.assert_can_migrate` for each host in each pool connected to XenCenter to
check whether the VM can be migrated to it. The API `VM.assert_can_migrate`
then calls `VM.export_metadata`, which locks the VM. During this time, other
`VM.export_metadata` requests will fail as they cannot acquire the VM lock.

The solution is to retry when failing to lock the VM.

Signed-off-by: Bengang Yuan <bengang.yuan@cloud.com>
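A minimal sketch of the retry idea described above (not xapi's actual code): retry an operation a bounded number of times when it fails because the VM lock is held, waiting between attempts. The `Vm_locked` exception, attempt count, and delay are assumptions. Requires the `threads` library.

```ocaml
exception Vm_locked

(* Run [f], retrying up to [attempts] times if the VM lock cannot be
   acquired; the final failure is re-raised to the caller. *)
let rec with_vm_lock_retry ?(attempts = 5) ?(delay = 1.0) f =
  try f ()
  with Vm_locked when attempts > 1 ->
    Thread.delay delay ;
    with_vm_lock_retry ~attempts:(attempts - 1) ~delay f
```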
…#6470)

Migration spawns 2 operations which depend on each other so we need to
ensure there is always space for both of them to prevent a deadlock
during localhost and two-way migrations. Adding VM_receive_memory to a
new queue ensures that there will always be a worker for the receive
operation so the paired send will never be blocked.

This will increase the total number of workers by worker-pool-size.
Unlike parallel_queues workers, these workers will be doing actual work
(VM_receive_memory), which could in theory increase the workload of a
host if it is receiving VMs at the same time as other work, so this
needs to be considered before merging this PR.
Sorry folks, looks like we need an epilogue to #6457; I forgot about this
backwards compatibility issue. Backwards compatibility is hard...
When the customers open "Migrate VM Wizard" on XenCenter, XenCenter will
call
`VM.assert_can_migrate` to check each host in each pool connected to
XenCenter
if the VM can be migrated to it. The API `VM.assert_can_migrate` then
calls
`VM.export_metadata`. `VM.export_metadata` will lock VM. During this
time, other
`VM.export_metadata` requests will fail as they can't get VM lock.

The solution is to add retry when failing to lock VM.
@changlei-li changlei-li merged commit c3ed9ca into feature/host-network-device-ordering May 27, 2025
122 of 125 checks passed